Alternative Frequency Scale Cepstral Coefficient for Robust Sound Event Recognition

Authors

  • Yi Ren Leng
  • Tran Huy Dat
  • Norihide Kitaoka
  • Haizhou Li
Abstract

There are two issues when applying MFCCs to sound event recognition: 1) sound events have a broader spectral range than speech, so the log-frequency scale is less informative; 2) low-frequency noise is more prevalent, so the log-frequency scale captures more noise. To address these issues, we study two alternative frequency scales and show that they outperform MFCCs for sound event recognition under mismatched conditions using Support Vector Machines (SVMs), without the need for complex algorithms.
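To make the role of the frequency scale concrete, the following is a minimal sketch of a cepstral front end in which only the spacing of the filterbank changes. The abstract does not name the two alternative scales it studies, so the linear scale used here is an assumption chosen purely for illustration, and all function names and parameter defaults (`band_edges`, `triangular_filterbank`, `cepstral_coefficients`, 26 filters, 13 coefficients) are hypothetical rather than taken from the paper.

```python
import numpy as np


def hz_to_mel(f):
    # Common mel mapping (O'Shaughnessy form); roughly logarithmic above ~1 kHz.
    return 2595.0 * np.log10(1.0 + f / 700.0)


def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def band_edges(n_filters, f_min, f_max, scale="mel"):
    """Return n_filters + 2 edge frequencies (Hz), equally spaced on `scale`."""
    if scale == "mel":
        mels = np.linspace(hz_to_mel(f_min), hz_to_mel(f_max), n_filters + 2)
        return mel_to_hz(mels)
    if scale == "linear":  # assumed alternative scale, for illustration only
        return np.linspace(f_min, f_max, n_filters + 2)
    raise ValueError(f"unknown scale: {scale}")


def triangular_filterbank(n_filters, n_fft, sample_rate, scale="mel"):
    """Triangular filters over the rFFT bins; shape (n_filters, n_fft // 2 + 1)."""
    edges = band_edges(n_filters, 0.0, sample_rate / 2.0, scale)
    freqs = np.fft.rfftfreq(n_fft, d=1.0 / sample_rate)
    bank = np.zeros((n_filters, freqs.size))
    for i in range(n_filters):
        lo, mid, hi = edges[i], edges[i + 1], edges[i + 2]
        rising = (freqs - lo) / (mid - lo)
        falling = (hi - freqs) / (hi - mid)
        bank[i] = np.clip(np.minimum(rising, falling), 0.0, None)
    return bank


def cepstral_coefficients(frame, sample_rate, n_filters=26, n_ceps=13, scale="mel"):
    """Windowed power spectrum -> filterbank -> log -> DCT-II for one frame."""
    n_fft = frame.size
    power = np.abs(np.fft.rfft(frame * np.hamming(n_fft))) ** 2
    bank = triangular_filterbank(n_filters, n_fft, sample_rate, scale)
    log_energy = np.log(bank @ power + 1e-10)  # small floor avoids log(0)
    n = np.arange(n_filters)
    k = np.arange(n_ceps)[:, None]
    dct_basis = np.cos(np.pi * k * (n + 0.5) / n_filters)  # unnormalised DCT-II
    return dct_basis @ log_energy


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    frame = rng.standard_normal(512)  # stand-in for one 32 ms frame at 16 kHz
    for scale in ("mel", "linear"):
        coeffs = cepstral_coefficients(frame, 16000, scale=scale)
        print(scale, coeffs.shape)
```

With the mel scale, the filters are narrow and dense at low frequencies and wide at high frequencies; with the linear scale, every band has the same width, so resolution is spread evenly across the spectrum. Everything after the filterbank stays the same, which is why swapping the scale needs no change to the classifier.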

Similar resources

Using Blob Detection in Missing Feature Linear-Frequency Cepstral Coefficients for Robust Sound Event Recognition

The proposed Missing Feature Linear-Frequency Cepstral Coefficients (MF-LFCC) is a noise robust cepstral feature that transforms both clean and noisy signals into a similar representation. Unlike conventional Missing Feature Techniques, the MF-LFCC does not require the substitution of spectrogram elements (imputation) or classifier modification (marginalization). To improve the noise mask used ...

Cepstrum derived from differentiated power spectrum for robust speech recognition

In this paper, cepstral features derived from the differential power spectrum (DPS) are proposed for improving the robustness of a speech recognizer in the presence of background noise. These robust features are computed from the speech signal of a given frame through the following four steps. First, the short-time power spectrum of the speech signal is computed through the fast ...

Lung sound classification using cepstral-based statistical features

Lung sounds convey useful information related to pulmonary pathology. In this paper, short-term spectral characteristics of lung sounds are studied to characterize the lung sounds for the identification of associated diseases. Motivated by the success of cepstral features in speech signal classification, we evaluate five different cepstral features to recognize three types of lung sounds: norma...

Selective gammatone filterbank feature for robust sound event recognition

This paper introduces a novel feature based on the raw output of the gammatone filterbank. Channel selection is used to enhance robustness over a range of signal-to-noise ratios (SNR) of additive noise. The recognition accuracy of the proposed feature is tested on a sound event database using a Hidden Markov Model (HMM) recogniser. A comparison with a series of similar features and the conventi...

Perceptual Significance of Cepstral Distortion Measures in Digital Speech Processing

Currently, one of the most widely used distance measures in speech and speaker recognition is the Euclidean distance between mel frequency cepstral coefficients (MFCCs). MFCCs are based on a filter bank algorithm whose filters are equally spaced on a perceptually motivated mel frequency scale. The value of the mel cepstral vector, as well as the properties of the corresponding cepstral distance, are d...

Publication date: 2011